Throughout this document, hover over the numbered annotations to the right of code chunks to reveal detailed explanations and comments about the code. Where italicized drop-down text is present, click the arrow to expand it and see the code.
create_vector_file_paths <- function(directory_path) {
  # List all files in the given directory path
  files_to_import <- fs::dir_ls(path = directory_path)

  # Loop through the files and print each with an index
  for (i in seq_along(files_to_import)) {
    cat(i, "= ", files_to_import[i], "\n")
  }

  # Return the vector of file paths
  return(files_to_import)
}

files_to_import <- create_vector_file_paths("data/raw")
The @iteratively-import-raw-data code chunk takes a long time to execute, so it should only be run once each time the raw data is updated. At all other times, run the @efficiently-load-raw-data code chunk instead to quickly import the up-to-date raw data.
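The run-once-then-load workflow described above can be sketched as a simple cache pattern. This is a minimal illustration under stated assumptions, not the project's actual chunks: `cache_path`, `import_raw_data_slowly()`, and the toy dataset are hypothetical names introduced only for this example.

```r
# Hypothetical cache location; the project's real file name is not shown here.
cache_path <- file.path(tempdir(), "raw_data_cache.RData")

# Hypothetical stand-in for the slow, iterative import step.
import_raw_data_slowly <- function() {
  list(toy_dataset = data.frame(id = 1:3, value = c(2.5, 3.1, 4.8)))
}

# Slow path: only runs when no cached copy exists yet.
if (!file.exists(cache_path)) {
  raw_data <- import_raw_data_slowly()
  save(raw_data, file = cache_path)
}

# Fast path: drop the in-memory copy, then restore it from disk.
if (exists("raw_data")) rm(raw_data)
loaded_names <- load(cache_path)  # load() returns the names of restored objects
```

Deleting the cached file (or re-running the slow step by hand) is what forces a refresh when the raw data changes.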
Refer to the printed output of the files_to_import data object to confirm that the index value you supply corresponds to the file path that needs to be loaded.
Efficiently import up-to-date raw data:
base::load(files_to_import[10])
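Because a hard-coded index can silently point to a different file when the contents of the folder change, one possible safeguard is to look the path up by name. The sketch below assumes the saved file ends in `.RData`; the file names are hypothetical placeholders, not the real contents of data/raw.

```r
# Hypothetical paths standing in for the output of fs::dir_ls("data/raw").
files_to_import <- c(
  "data/raw/example_a.csv",
  "data/raw/example_all_raw_data.RData",
  "data/raw/example_b.csv"
)

# Find the index of the file whose name matches the pattern.
rdata_index <- grep("\\.RData$", files_to_import)

# Fail early if the match is missing or ambiguous.
stopifnot(length(rdata_index) == 1)

rdata_path <- files_to_import[rdata_index]
# base::load(rdata_path) would then restore the saved objects.
```

The index printed by create_vector_file_paths() remains useful as a quick visual check of what the pattern matched.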
Rename datasets:
We will always use snake case when naming our data objects and functions (e.g., data_object_name or function_name()).
# Merging
clams_growth_biogeochem_vars_merged <- full_join(ksf_clams_growth_data_tidied, biogeochem_vars_merged, by = "date")
Oyster Growth Interpolated and Merged with Environmental Variables
oyster_growth_biogeochem_vars_merged <- full_join(ksf_oyster_cylinder_growth_data_tidied, biogeochem_vars_interp, by = "date")

# Interpolating -- other option is to aggregate to weekly or monthly, but dates
# are very mismatched to aggregate to monthly and dataset would be very small if
# aggregated to monthly
oyster_growth_biogeochem_vars_interp <- oyster_growth_biogeochem_vars_merged %>%
  mutate(across(where(~is.numeric(.x) && any(is.na(.x))), ~na.approx(.x, na.rm = FALSE, rule = 2))) %>%
  # relocate(date, round, location, depth, clams_color, clams_stage, .before = days_btwn_clams_sort) %>%
  arrange(date)
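To see why a full join on mismatched dates leaves gaps, and how linear interpolation with `rule = 2` fills them, here is a small base R sketch on toy data. It is an illustration only: `merge(..., all = TRUE)` stands in for `full_join()`, `stats::approx()` stands in for `zoo::na.approx()`, and the data frames, column names, and the `fill_gaps()` helper are hypothetical. Note that this sketch interpolates against the date values, whereas `na.approx()` called without an `x` argument interpolates against row position.

```r
# Toy data: growth and environmental variables sampled on mismatched dates.
growth <- data.frame(
  date = as.Date(c("2024-01-01", "2024-01-15")),
  shell_length_mm = c(10, 14)
)
env <- data.frame(
  date = as.Date(c("2024-01-01", "2024-01-08", "2024-01-22")),
  temp_c = c(24, 25, 26)
)

# Full join on date: unmatched dates get NA in the other table's columns.
merged <- merge(growth, env, by = "date", all = TRUE)
merged <- merged[order(merged$date), ]

# Hypothetical helper: linear interpolation over dates.
# rule = 2 carries the nearest known value outward past the first/last observation.
fill_gaps <- function(values, dates) {
  known <- !is.na(values)
  stats::approx(
    x = as.numeric(dates[known]), y = values[known],
    xout = as.numeric(dates), rule = 2
  )$y
}

interp <- merged
interp$shell_length_mm <- fill_gaps(merged$shell_length_mm, merged$date)
interp$temp_c <- fill_gaps(merged$temp_c, merged$date)
```

In this toy case the trailing shell length is filled with the last observed value rather than extrapolated, which is the behaviour `rule = 2` is chosen for.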
Processed Datasets
Export Tidied Datasets
Export the tidied datasets as CSV files to the data/tidied folder:
Iterate the export_to_csv(df, df_name, dir_path) function over each dataframe. Here, .x refers to the dataframe and .y refers to the name of the dataframe. Both are passed to the export_to_csv() function along with the desired directory path.
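The iteration described above can be sketched as follows. The body of `export_to_csv()` is not shown in this document, so the version below is a hypothetical stand-in that only shares its signature; the toy datasets and the temporary output folder are likewise assumptions. The sketch uses base R `Map()` so that it runs without extra packages; the `.x` / `.y` notation in the text suggests a purrr call such as `purrr::iwalk()`, shown in a comment.

```r
# Hypothetical stand-in for the project's export_to_csv(); the real body may differ.
export_to_csv <- function(df, df_name, dir_path) {
  dir.create(dir_path, recursive = TRUE, showWarnings = FALSE)
  out_path <- file.path(dir_path, paste0(df_name, ".csv"))
  utils::write.csv(df, out_path, row.names = FALSE)
  invisible(out_path)
}

# Toy named list; each name becomes the file name of the exported CSV.
tidied_datasets <- list(
  toy_growth = data.frame(date = as.Date("2024-01-01") + 0:1, weight_g = c(1.2, 1.5)),
  toy_env    = data.frame(date = as.Date("2024-01-01") + 0:1, temp_c = c(24.1, 24.3))
)

# Temporary folder used here in place of data/tidied.
out_dir <- file.path(tempdir(), "tidied")

# Call export_to_csv() once per dataframe, passing the dataframe and its name.
# With purrr, the equivalent would be:
#   purrr::iwalk(tidied_datasets, ~ export_to_csv(.x, .y, out_dir))
exported_paths <- unlist(Map(
  function(df, df_name) export_to_csv(df, df_name, out_dir),
  tidied_datasets, names(tidied_datasets)
))
```

Keeping the datasets in a named list means adding a new export only requires adding one list element.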
Export the final merged dataset to the data/outputs folder.